Simpson’s Paradox in the interpretation of “leaky pipeline” data
Abstract
Traditional ‘leaky pipeline’ plots are widely used to inform gender equality policy and practice. Here we demonstrate how a statistical phenomenon known as Simpson’s paradox can obscure trends in gender ‘leaky pipeline’ plots. We use Excel spreadsheets to generate hypothetical ‘leaky pipeline’ plots of gender inequality within an organisation. The principal factors that make up these hypothetical plots are inputs to the model, so a range of potential situations can be modelled, and we show how each principal factor is then reflected in the ‘leaky pipeline’ plots. We find that the effect of Simpson’s paradox on leaky pipeline plots can be illustrated simply and clearly with hypothetical modelling. Our study augments the findings of other statistical reports of Simpson’s paradox in clinical trial data and in gender inequality data. The findings in this paper, however, are presented in a way that makes the paradox accessible to a wide range of people.
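The effect described in the abstract can be sketched with a small numerical example. The following is a minimal illustration in Python, not the authors' Excel model: the two departments, the two career stages and all head counts are hypothetical values chosen only to produce a reversal. Within each department the share of women rises from the junior to the senior stage, yet the aggregated share, which is what a single organisation-wide 'leaky pipeline' plot would show, falls.

```python
# Hypothetical head counts as (women, total) per department and career stage.
# These numbers are invented for illustration; they are not from the paper.
counts = {
    "Dept A": {"junior": (60, 100), "senior": (7, 10)},
    "Dept B": {"junior": (20, 100), "senior": (15, 50)},
}
STAGES = ("junior", "senior")


def share(women, total):
    """Fraction of women in a group."""
    return women / total


def stage_shares(counts):
    """Return the share of women per department and for the pooled data."""
    per_dept = {
        dept: {stage: share(*stages[stage]) for stage in STAGES}
        for dept, stages in counts.items()
    }
    aggregated = {}
    for stage in STAGES:
        # Pool the raw counts across departments before taking the ratio.
        women = sum(stages[stage][0] for stages in counts.values())
        total = sum(stages[stage][1] for stages in counts.values())
        aggregated[stage] = share(women, total)
    return per_dept, aggregated


per_dept, aggregated = stage_shares(counts)
for dept, shares in per_dept.items():
    print(dept, {stage: round(value, 3) for stage, value in shares.items()})
print("All departments", {s: round(v, 3) for s, v in aggregated.items()})
```

With these assumed counts, the share of women goes from 60% to 70% in Dept A and from 20% to 30% in Dept B, while the pooled share drops from 40% to about 37%. The reversal arises because most senior posts sit in the department with fewer women, so the aggregated plot reflects the department mix as well as any change within departments.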
Similar resources
Computational Social Scientist Beware: Simpson's Paradox in Behavioral Data
Observational data about human behavior is often heterogeneous, i.e., generated by subgroups within the population under study that vary in size and behavior. Heterogeneity predisposes analysis to Simpson’s paradox, whereby the trends observed in data that has been aggregated over the entire population may be substantially different from those of the underlying subgroups. I illustrate Simpson’s...
Integrating Bayesian Networks and Simpson’s Paradox in Data Mining
This paper proposes to integrate two very different kinds of methods for data mining, namely the construction of Bayesian networks from data and the detection of occurrences of Simpson’s paradox. The former aims at discovering potentially causal knowledge in the data, whilst the latter aims at detecting surprising patterns in the data. By integrating these two kinds of methods we can hopefully ...
Simpson’s Paradox – A Survey of Past, Present and Future Research
Simpson’s paradox refers to the reversal of a statistical relationship between two variables in sub-populations when the sub-populations are combined and analyzed as a population. This article is intended to provide a broad survey of the past, present and future research surrounding the issue. Real data from a discrimination litigation case is examined to identify the occurrence of the paradox....
How Likely is Simpson's Paradox in Path Models?
Simpson’s paradox is a phenomenon arising from multivariate statistical analyses that often leads to paradoxical conclusions, in the field of e-collaboration as well as in many other fields where multivariate methods are employed. We derive a general inequality for the occurrence of Simpson’s paradox in path models with or without latent variables. The inequality is then used to estimate the proba...
How Likely is Simpson’s Paradox?
What proportion of all 2×2×2 contingency tables exhibit Simpson’s Paradox? An approximate answer is obtained for large sample sizes and extended to 2×2×l tables. Several conditional probabilities of the occurrence of Simpson’s Paradox are also derived. Given that the observed cell frequencies satisfy a Simpson reversal, the posterior probability that the population parameters satisfy the same...